INTRODUCTION


The goal of this project is to develop a robust computer-aided diagnosis (CAD) system for mass
detection with high sensitivity and specificity in digitized mammograms. As listed in the
Statement of Work, the research scope in the third year of the project is to generate databases
and use them to evaluate detection performance and robustness. The evaluation strategy is to use
free-response receiver operating characteristic (FROC) analysis to compare the old algorithm and
the CAD algorithm newly developed in this project, in terms of detection sensitivity/specificity
and the dependence of detection performance on database selection.


BODY 

Objective 1: to have typical databases for different evaluation purposes.

Accomplishments: 

Two  databases  were  generated  in  this  evaluation  study. 

Database  I  for  detection  performance  evaluation: 

The mammograms in this database were originally digitized by a DBA digitizer at 60
µm and 16-bit gray scale. Because the new algorithm proposed in this project was developed
on Lumisys data, a mapping from the DBA data format to the Lumisys data format was applied
before the new algorithm was evaluated. The database used in this FROC study consists of
three datasets: 106 negative cases, 50 benign cases and 58 minimal cancer cases. Among the
50 benign cases, 32 are abnormal in terms of mass. Of the 58 minimal cancer cases, 39
contain masses.

Database  II  for  detection  robustness  evaluation: 

The mammograms in this database were digitized by a Lumisys digitizer at 60 µm resolution
with 15-bit gray scale. They were subsampled by a factor of 3 to reduce the image size for
mass detection, which corresponds approximately to a spatial resolution of 180 µm. Because
the detection algorithm that predates this project's modifications was developed for DBA
data, a mapping from the Lumisys data format to the DBA data format was applied before the
old algorithm could be evaluated. This database was randomly split into two datasets to test
algorithm generalizability. The first dataset used in this FROC study consists of 102 cases,
of which 31 are abnormal in terms of masses. The second dataset has 96 cases, of which 29
are abnormal. Among the abnormal cases, 61% are benign and 39% are malignant in dataset I,
compared to 18% benign and 82% malignant in dataset II.
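
The subsampling step for Database II can be illustrated with a short sketch. The report does
not state how the factor-of-3 reduction was performed, so block averaging is an assumption
here, and the function names subsample_block_mean and effective_pixel_size_um are
hypothetical.

```python
def subsample_block_mean(image, factor=3):
    """Average non-overlapping factor x factor blocks of a 2-D list of pixel values.

    Assumption: block averaging. The report only says the images were
    'subsampled by a factor of 3'; the actual method is not given.
    Rows and columns that do not fill a complete block are discarded.
    """
    rows = len(image) // factor * factor
    cols = len(image[0]) // factor * factor
    out = []
    for r in range(0, rows, factor):
        out_row = []
        for c in range(0, cols, factor):
            block = [image[r + dr][c + dc]
                     for dr in range(factor)
                     for dc in range(factor)]
            out_row.append(sum(block) / len(block))
        out.append(out_row)
    return out


def effective_pixel_size_um(pixel_size_um, factor):
    """Pixel size after subsampling: each output pixel spans 'factor' input pixels."""
    return pixel_size_um * factor


# Toy 6 x 6 image (not real mammogram data) reduced to 2 x 2.
toy_image = [[r * 6 + c for c in range(6)] for r in range(6)]
small = subsample_block_mean(toy_image, factor=3)
pixel_um = effective_pixel_size_um(60, 3)  # 60 um digitization, factor 3
```

With the 60 µm digitization and a factor of 3, the effective pixel size is 3 x 60 = 180 µm,
which matches the figure quoted above.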
Objective 2: to evaluate the detection performance improvement of the CAD system.

Accomplishments:

Both algorithms were evaluated on the benign, minimal cancer and normal case datasets
separately. The FROC curves of case detection sensitivity versus false-positive signals are
shown in Figure 1 and Figure 2. The algorithm developed in this project consistently
outperforms the old algorithm we developed before. The false-positive rates of the two
algorithms on the negative dataset are listed in Table 1 and Table 2. The new algorithm
generated far fewer false-positive signals.
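
How one FROC working point is obtained can be sketched as follows. This is a minimal
illustration, not the evaluation code used in this project: the detection tuple layout, the
function name froc_points and the toy scores are all hypothetical. It assumes that a case
counts as detected when at least one true-positive detection survives the score threshold,
and that false positives are averaged over all images.

```python
def froc_points(detections, n_abnormal_cases, n_images, thresholds):
    """Compute (threshold, case sensitivity, FPs per image) for each threshold.

    detections: list of (case_id, score, is_true_positive) tuples (hypothetical layout).
    """
    points = []
    for t in thresholds:
        kept = [d for d in detections if d[1] >= t]
        # A case is detected if any kept detection hits a true mass.
        detected_cases = {case_id for case_id, _, is_tp in kept if is_tp}
        false_positives = sum(1 for _, _, is_tp in kept if not is_tp)
        sensitivity = len(detected_cases) / n_abnormal_cases
        fps_per_image = false_positives / n_images
        points.append((t, sensitivity, fps_per_image))
    return points


# Toy data only: three abnormal cases (A, B, C) and four images in total.
toy_detections = [
    ("A", 0.9, True), ("A", 0.4, False),
    ("B", 0.6, True), ("B", 0.7, False),
    ("C", 0.3, False), ("C", 0.2, True),
]
points = froc_points(toy_detections, n_abnormal_cases=3, n_images=4,
                     thresholds=[0.8, 0.5, 0.1])
```

Lowering the threshold moves along the curve: sensitivity rises while the number of
false-positive signals per image also rises, which is why the two algorithms are compared at
several working points rather than at a single one.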


Figure 1. FROC curves on the benign dataset in Database I, where V1-B and
V2-B are the results of the old and new detection algorithms, respectively.

Figure 2. FROC curves on minimal cancer cases in Database I, where
V1-M and V2-M are the results of the old and new algorithms, respectively.


Table 1. FPs of the old algorithm on negative cases
at five different working points.

Table 2. FPs of the new algorithm on negative cases
at five different working points.
Objective 3: to evaluate the robustness of the CAD system.

Accomplishments:

To evaluate the improvement in detection robustness, the two generations of the algorithm
were tested and compared on the two datasets in Database II. The first dataset used in this
FROC study consists of 102 cases, of which 31 are abnormal in terms of masses; the second
dataset has 96 cases, of which 29 are abnormal. Among the abnormal cases, 61% are benign and
39% are malignant in dataset I, compared to 18% benign and 82% malignant in dataset II. The
FROC curves of the detection results for the old and newly developed algorithms are shown in
Figure 3. The detection performance of both algorithms drops on the second dataset compared
to the first. For example, the case detection sensitivity of the first-generation algorithm
drops from 84% at 3.32 false-positive signals (FPs) per image to 72% at 3.41 FPs per image.
The performance of the new algorithm also drops, from 87% at 2.05 FPs to 86% at 2.20 FPs.
Overall, the following was observed: (1) the large improvement in detection performance of
the newly developed algorithm over the old algorithm is obtained consistently on both
datasets, as it was in the testing on Database I, and the improvement is, to some extent,
even larger on the second dataset; (2) the new algorithm shows a smaller drop in performance
than the old algorithm, i.e., the new algorithm is more robust to variation in database
characteristics. A review of the mammogram datasets and their detection results found that
the major cause of the performance drop is that the cases in the second dataset are more
difficult than those in the first: 82% of its abnormal cases are malignant, as opposed to
only 39% in the first dataset.
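
The two observations can be checked with simple arithmetic on the working points quoted
above. The sketch below uses only the sensitivity and FP values reported in this section; the
dictionary layout and the helper names are hypothetical, and the four working points are
compared directly even though their FP levels differ slightly.

```python
# (case sensitivity in %, FPs) at the working points quoted in the text.
working_points = {
    "old": {"dataset_1": (84, 3.32), "dataset_2": (72, 3.41)},
    "new": {"dataset_1": (87, 2.05), "dataset_2": (86, 2.20)},
}


def sensitivity_drop(algorithm):
    """Percentage-point loss in sensitivity from dataset 1 to dataset 2."""
    points = working_points[algorithm]
    return points["dataset_1"][0] - points["dataset_2"][0]


def sensitivity_gain(dataset):
    """Percentage-point gain of the new algorithm over the old one on a dataset."""
    return working_points["new"][dataset][0] - working_points["old"][dataset][0]


old_drop = sensitivity_drop("old")          # 84 - 72
new_drop = sensitivity_drop("new")          # 87 - 86
gain_first = sensitivity_gain("dataset_1")  # 87 - 84
gain_second = sensitivity_gain("dataset_2") # 86 - 72
```

The old algorithm loses 12 percentage points of sensitivity between the two datasets and the
new algorithm loses 1, while the gain of the new algorithm over the old grows from 3 points on
the first dataset to 14 points on the second, in each case at a lower number of FPs.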
Figure 3. FROC curves of detection using the old and new algorithms
on two independent datasets.


KEY RESEARCH ACCOMPLISHMENTS

1. Two databases were generated for CAD system evaluation.

2. An evaluation of detection performance was carried out. The new methods developed in
this project yielded a large improvement.

3. An evaluation of CAD system robustness was carried out. The new CAD system developed in
this project shows much better detection generalizability.
CONCLUSIONS

The great variation in the characteristics of mammograms and masses makes it difficult to
develop a CAD system with high detection performance and good generalizability. Typical
variations between mammograms result from the imaging process (such as film exposure and film
labels), the digitization process (such as spatial and intensity resolution and the response
function to optical density) or, most importantly, the inherent characteristics of breast
tissue. Masses vary in size, contrast, shape, location, intensity pattern and relation to the
surrounding tissues. The research work in the third year of this project was to generate
databases and use them to evaluate the CAD system in terms of performance improvement and
generalizability. The evaluation results demonstrated that the algorithm developed in this
project performs much better than our old (first-generation) algorithm.